A Logic Programming View of Relational Morphology
Abstract
We use the more abstract term "relational morphology" in place of the usual "two-level morphology" in order to emphasize an aspect of Koskenniemi's work which has been overlooked in favor of implementation issues using the finite state paradigm, namely, that a mathematical relation can be specified between the lexical and surface levels of a language. Relations, whether finite state or not, can be computed using any of several paradigms, and we present a logical reading of a notation for relational morphological rules (similar to Koskenniemi's) which can in fact be used to automatically generate Prolog program clauses. As with the finite state implementations, the relation can be computed in either direction, from the surface to the lexical level or vice versa. At the very least, this provides a morphological complement to logic grammars which deal mainly with syntax and semantics, in a programming environment which is more user-friendly than the finite state programming paradigm. The morphological rules often compile simply into unification of the arguments in the generated morphology predicate followed by a recursive call of that predicate. Further speed can be obtained when a Prolog compiler, rather than an interpreter, is used for execution.

Introduction. Kimmo Koskenniemi's so-called "two-level model" of computational morphology (1983), in which phonological rules are implemented as finite state transducers, has been the subject of a great deal of attention. The two-level model is based partly on earlier work of Johnson (1972), who considered that a set of "simultaneous" phonological rules could be represented by such a transducer, and of Kaplan and Kay (1983), who thought that ordered generative rules could be implemented as a cascading sequence of such transducers.
Koskenniemi in fact implemented the phonological rules by a set of finite state transducers running in parallel, rather than by a single large finite state machine into which many cascading machines could be combined. Subsequent to Koskenniemi's original work, there was a LISP-based implementation called KIMMO (Karttunen 1983), and two-level descriptions of English, Rumanian, French and Japanese (Karttunen and Wittenburg; Khan; Lun; Sasaki Alam 1983). A later LISP-based implementation by Dalrymple et al. (1987) called DKIMMO/TWOL helped the user by converting two-level rules into finite state transducers: in earlier implementations, and in the recent PC-KIMMO system (Antworth 1990), it was the user's task to generate the machines from two-level descriptions.

However, one very important contribution of Koskenniemi to morphology, namely the notion that there is a relation between the surface and lexical "levels", has been somewhat overlooked in favor of implementation issues having to do with the conversion of two-level rules into finite state automata in the various KIMMO systems. According to this notion, the two-level rules express a correspondence between lexical and surface levels, unlike the rules of generative morphology, which transform representations at one level into representations at the other. Furthermore, since no directionality is implied in the definition of a relation, the same set of two-level rules, unlike generative rules, applies both in going from the surface to the lexical level and vice versa. Rather than being procedural rules, they are declarative. Consequently, any correct implementation of the two-level rules is a relational program which can be used either analytically or generatively. In order to emphasize the fact that a relation is being defined by them, we will henceforth refer to relational morphology rules rather than use the mathematically neutral term "two-level rules".
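The point that a relation carries no directionality can be illustrated with a small Prolog sketch. The names pair/2 and correspond/2 and the pair table itself are assumptions introduced only for this illustration; they are not taken from any KIMMO system or from the rule compiler described in this paper, and no context conditions are modelled.

```prolog
% pair(LexicalSymbol, SurfaceSymbol): a hypothetical table of allowed pairs.
pair(s, s).
pair(p, p).
pair(y, y).
pair(y, i).   % assumed: lexical y may also be realised as surface i

% correspond(Lex, Surface): the two symbol lists are related pair by pair.
% Neither argument is designated as input or output.
correspond([], []).
correspond([L|Ls], [S|Ss]) :-
    pair(L, S),
    correspond(Ls, Ss).
```

With the lexical list given, the query `?- correspond([s,p,y], S).` enumerates S = [s,p,y] and then S = [s,p,i]; with the surface list given, `?- correspond(L, [s,p,i]).` yields L = [s,p,y]. The same two clauses serve generation and analysis. Without context conditions the relation overgenerates, which is what the contexts in the rules are there to restrict.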
Despite the recognition that relational morphology rules are declarative, this aspect has been obscured in practice by the original finite state implementation technique. Recently, Bear (1986) has interpreted such rules directly, using Prolog as an implementation language. This, although an improvement on finite state implementations from the point of view of debugging and clarity, still misses an important aspect of relational morphology rules as a declarative notation: if relational morphology rules define a relation between surface and lexical levels, then that relation can be specified and implemented using any of several different relational programming paradigms.

In this paper, we will show that logic programming, which can be viewed as a relational programming paradigm, can be used to give a declarative reading to morphological rules. Further, because of the execution model for logic programs, embodied in various logic programming languages such as Prolog, the declarative reading also has a convenient procedural reading. That is, each relational morphological rule may be thought of as corresponding to or generating a logic program clause. The entire set of logic program clauses generated from the relational morphological rules, coupled with some utility predicates, then constitutes a morphological analyser which can either be used as a stand-alone program or be coupled as a module to other linguistic tools to build a natural language processing system. Since the rules have been transformed into logic program clauses, they gain in speed of execution over Bear's interpretive method, and further speed can be gained by compiling these clauses using existing Prolog compilers. At the very least, this provides a morphological complement to logic grammars (Abramson and Dahl 1989) which deal mainly with syntax and semantics, in a programming environment which we believe is more user-friendly than the finite state programming paradigm.
Actes de COLING-92, Nantes, 23-28 août 1992. Proc. of COLING-92, Nantes, Aug. 23-28, 1992.

It may be argued that this is a step backwards from the linear efficiency of finite state processing. However, when discussing "efficiency" it is important to be precise as to where the efficiency lies and what it consists of. Finite state processing is linear in the sense that a properly implemented finite state machine will be able to decide whether a string of length n is acceptable or not in a time which is O(n), i.e., for large enough n, bounded by a constant multiple of n. For small values of n, depending on how much bookkeeping has to be done, "finite state algorithms" may perform worse than algorithms which are formally O(n^2) or higher. Any processing in addition to recognition may involve time factors which are more than linear.

This entirely leaves aside the question of the user-friendliness of the finite state computing paradigm, a question of how "efficient" in human terms it is to use finite state methods. Anyone who has tried to implement finite state automata of substantial size directly (as in Koskenniemi's original implementation, the first KIMMO systems, and PC-KIMMO) will have realised that programming finite state machines is distastefully akin to directly programming Turing machines. A substantial amount of software is necessary in order to provide a development, debugging and maintenance environment for easy use of the finite state computing paradigm. There also remains the theoretical question as to the adequacy of finite state morphological descriptions for all, or even most, human languages. However, this is a topic we shall not venture into in this paper.

In our method, a relatively small Prolog program generates logic programming clauses from relational morphology rules.
The generated clauses (at least in the experiments so far) are readable, and it is easy to correlate a generated clause with the original morphological rule, which helps debugging. The standard debugging tools of Prolog systems (at the very least, sophisticated tracing facilities) seem sufficient to deal with rule debugging, and the readability of the generated clauses should help in the maintenance and transfer of morphological programs. Thus, from the software engineering point of view, logic programming is a more sophisticated, higher-level programming paradigm than finite state methods. Also, should finite state descriptions in the end prove inadequate, or even inconvenient, for all of morphology, logic programming provides expressive power for reasonable extension of the notation of relational morphology rules.

The current availability of Prolog compilers, even for small machines, provides a further gain in execution speed for the generated programs. Many of the morphological rules produce logic program clauses in which checking of the lexical and surface elements and contexts reduces to unification followed by a recursive call of the morphology predicate. Compiled Prolog abstract machine code for such clauses is usually very compact. Prolog compiler indexing mechanisms often make it possible to access the correct clause to be applied in constant time.

Notational Aspects. Our tableau notation for relational morphology rules is as follows:

    LexLeft <= Lex => LexRight <:>
    SurfaceLeft <= Surface => SurfaceRight.

which expresses the relation between a lexical and a surface unit (Lex and Surface, respectively), provided that the left and right contexts at both the lexical and surface levels (LexLeft, LexRight, SurfaceLeft, and SurfaceRight) are satisfied. Another kind of relational morphology rule which is allowed is:

    LexLeft => Lex <= LexRight <:>
    SurfaceLeft => Surface <= SurfaceRight.

which expresses a relationship between Lex and Surface provided that the left and right contexts at the lexical and surface levels are different from those specified by LexLeft, LexRight, SurfaceLeft, and SurfaceRight. This means that either LexLeft or LexRight is not satisfied, and also that either SurfaceLeft or SurfaceRight is not satisfied. More compact notation is also accepted, for example:

    LexLeft:SurfaceLeft <= Lex:Surface => LexRight:SurfaceRight.
    LexLeft:SurfaceLeft => Lex:Surface <= LexRight:SurfaceRight.

In the case where a pair of lexical and surface contexts are identical, or if the lexical and surface elements are identical, they need not be repeated. Such compressed rules as the following are also allowed:

    Left <= Element => Right.
    Left => Element <= Right.

Sets of symbols may be specified:

    set(vowel,[a,e,i,o,u]).

Lexical entries may be specified as follows:

    lexicon::{cat=noun,root=craps}.
    lexicon::{cat=noun,root=pimlo}.

This feature notation is that used in the author's Definite Feature Grammars (Abramson 1991).

Logical reading of relational morphology rules. Corresponding to a set of relational morphology rules, a binary predicate morphology/2 specifies the relation between a lexical and a surface stream of characters:

    morphology(LexStream, SurfaceStream).

In order to specify the logic program clause which corresponds to a relational morphology rule, we have to manipulate the left and right lexical and surface contexts. We can find the right contexts within LexStream and SurfaceStream, but we have to provide a specification of the left contexts, and we do this by defining the above binary predicate morphology/2 in terms of a quaternary predicate morphology/4:

    morphology(LexStream, SurfaceStream, LeftLexStream, LeftSurfaceStream).

LeftLexStream and LeftSurfaceStream are initially empty and are represented as reverse order lists of the left contexts which have already been seen. The top level definition of morphology/2 is:

    morphology(LexStream, SurfaceStream) :-
        morphology(LexStream, SurfaceStream, [], []).

The fundamental logic program clause corresponding to a relational morphology rule such as:

    LexLeft <= Lex => LexRight <:>
    SurfaceLeft <= Surface => SurfaceRight.
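The top-level definition and the reversed left-context lists described above can be tried out in a minimal runnable sketch, written here for SWI-Prolog. Only set/2 and the definition of morphology/2 come from the text. The alphabet symbol/1, the two example rules and the shape of the morphology/4 clauses are assumptions made for illustration; they show how context checks can reduce to head unification followed by a recursive call, and should not be read as the clauses the rule compiler actually generates.

```prolog
:- use_module(library(lists)).

% set/2 as in the text; symbol/1 is a hypothetical alphabet for this sketch.
set(vowel, [a,e,i,o,u]).
symbol(C) :- member(C, [a,e,i,o,u,m,n,p,s,t,z]).

% Top-level definition from the text: both left contexts start out empty.
morphology(LexStream, SurfaceStream) :-
    morphology(LexStream, SurfaceStream, [], []).

% Both streams exhausted: the correspondence holds.
morphology([], [], _, _).

% Assumed rule 1: lexical n may surface as m when p follows at both levels.
% The right contexts are checked by unification in the clause head.
morphology([n,p|Lex], [m,p|Surf], LeftLex, LeftSurf) :-
    morphology([p|Lex], [p|Surf], [n|LeftLex], [m|LeftSurf]).

% Assumed rule 2: lexical s may surface as z immediately after a vowel.
% Left contexts are kept in reverse order, so the nearest symbol is the head.
morphology([s|Lex], [z|Surf], [V|LeftLex], [V|LeftSurf]) :-
    set(vowel, Vowels),
    memberchk(V, Vowels),
    morphology(Lex, Surf, [s,V|LeftLex], [z,V|LeftSurf]).

% Default: any symbol of the alphabet may correspond to itself.
morphology([C|Lex], [C|Surf], LeftLex, LeftSurf) :-
    symbol(C),
    morphology(Lex, Surf, [C|LeftLex], [C|LeftSurf]).
```

Under these assumptions, `?- morphology([i,n,p,u,t], S).` enumerates S = [i,m,p,u,t] and then S = [i,n,p,u,t], while `?- morphology(L, [i,m,p,u,t]).` enumerates L = [i,n,p,u,t] and then L = [i,m,p,u,t]. Both example rules are optional, so the identity pairing is always available as well; a rule that forces a correspondence would need an additional check, which is not attempted here.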
Similar resources
A Dialogue Manager for Accessing Databases
We present a logic programming based dialogue system that enables the access in natural language to the heterogeneous external relational databases of the Évora University. The proposed system has the capability of inferring user attitudes and uses ISCO in order to view the University relational databases as a part of a declarative/deductive object-oriented (with inheritance) database allowing ...
Evaluation and Comparison Criteria for Approaches to Probabilistic Relational Knowledge Representation
In the past ten years, the areas of probabilistic inductive logic programming and statistical relational learning put forth a large collection of approaches to combine relational representations of knowledge with probabilistic reasoning. Here, we develop a series of evaluation and comparison criteria for those approaches and focus on the point of view of knowledge representation and reasoning. ...
Fuzzy Relational Visualization for Decision Support
A study on fuzzy relational visualization in system development aspects is presented. The front-end enables dynamic and scalable changes in visualization according to user’s expertise and inspiration. Integrative management of various data and knowledge is handled by the back-end at any scale in cloud computing environment. Extended Logic Programming is used as the core of fuzzy relational mana...
An Inductive Logic Programming Query Language for Database Mining
First, a short introduction to inductive logic programming and machine learning is presented and then an inductive database mining query language RDM (Relational Database Mining language). RDM integrates concepts from inductive logic programming, constraint logic programming, deductive databases and meta-programming into a flexible environment for relational knowledge discovery in databases. Th...
A Materialized View for the Same Generation Query in Deductive Databases
Traditionally, deductive databases are designed as extensions to relational databases by either integrating a logic programming language, such as PROLOG, with a conventional relational database system that provides storage persistence needed for any database system, or by integrating an expert system with a relational database system. Deductive databases take advantage of a special kind of rule...